[57] ABSTRACT

A system and method for communicating audio data in a 
packet-based computer network wherein transmission of 
data packets through the computer network requires variable 
periods of transmission time. The system comprises: (1) a 
packet assembly circuit for constructing a data packet from 
a portion of a stream of digital audio data corresponding to 
an audio signal, the packet assembly circuit generating a 
position identifier indicating a temporal position of the 
portion relative to the stream, inserting the position identifier 
into the data packet and queuing the data packet for trans- 
mission through a backbone of the computer network and 
(2) a packet disassembly circuit, having a buffer associated 
therewith, for receiving the data packet from the backbone, 
the packet disassembly circuit inserting the portion into an 
absolute location of the buffer, the position identifier deter- 
mining the location, the portion thereby synchronized with 
adjacent portions of the stream of digital audio data in the 
buffer to compensate for the variable periods of transmission 
time. 

40 Claims, 3 Drawing Sheets 
SYSTEM AND METHOD FOR
COMMUNICATION OF AUDIO DATA OVER 
A PACKET-BASED NETWORK 

TECHNICAL FIELD OF THE INVENTION

The present invention is directed, in general, to computer 
networks and, more specifically, to a system and method for 
transmitting and receiving digitized audio data in a
packet-based computer network to compensate for variable packet
transmission times (jitter). 

BACKGROUND OF THE INVENTION 

Historically, entirely separate communication systems 
have been employed to transmit audio data (sometimes
referred to shorthandedly as "voice" for simplicity's sake) 
and computer data (sometimes abbreviated "data" for a like 
purpose, although it should be understood that "voice" data 
and computer "data" both fall within the broad definition of 
"data" ). 

Over a century ago, analog telephone networks were 
developed to carry analog audio signals. Telephone net- 
works allow communication of audio data, or more broadly, 
audio signals between two or more users by establishing,
with central switching equipment, a dedicated communica- 
tion circuit or "channel" among the users. Because the 
channel, once established, is dedicated exclusively to trans- 
mission of the users' conversation, the conversation is not 
required to compete for the channel's bandwidth with other
conversations. The advantage of having a dedicated channel 
per conversation is that any transmission delay from
speaker to listener is purely a function of the unfettered 
speed of the audio signal through the telephone network. 
Since this speed does not significantly vary over time, such
dedicated channels are capable of providing "isochronous" 
transmission. Unfortunately, one significant disadvantage of 
dedicated channels is that they require significant band- 
width; that is, the complete bandwidth of the channel 
remains available and dedicated to carriage of the
conversation even when no audio information is being transmitted.

In recent years, efforts have been underway to establish 
interface standards for digital transmission of audio signals 
over telephone networks. The most noted of the existing 
standards is the Integrated Services Digital Network
("ISDN") significantly sponsored by AT&T. ISDN standard- 
izes connection interfaces, transmission protocols and ser- 
vices to create a unified digital circuit switching network. 
More recently, recommendations for Broadband ISDN 
("BISDN") have been adopted. Unlike ISDN, which is a
digital network standard, BISDN uses packet relay, or 
Asynchronous Transfer Mode ("ATM") as a transmission 
standard, and is of particular importance in transmission 
over broadband "backbones" and, in particular, fiber optic 
lines. ATM is primarily a connection-oriented technique that
can transport both connection and connectionless-oriented 
services at either a constant bit rate or a variable bit rate. 
ATM provides bandwidth on demand and handles all traffic 
types through fast-packet switching techniques that reduce 
the processing of protocols and uses statistical multiplexing.

In ATM, audio data are split into relatively small blocks 
or packets, commonly called "cells." The cells are individually
communicated through the ATM network by transmitters and
receivers that are not synchronized. Networks limited to
synchronous transmission generally require dedicated
channels and a clock to control the synchronous
transmission of audio data through the network. Therefore,
ATM allows telephone networks to depart from the above- 
described synchronous transmission of audio data over 
dedicated, isochronous channels, thereby dramatically 
increasing network efficiency by combining previously 
dedicated channels and decreasing cost by eliminating syn- 
chronicity. Both ISDN and BISDN therefore hold much 
promise for the future. However, widespread application of 
these standards has been slow, as the installed base of analog 
equipment (including telephone sets) is substantial and 
presents great resistance to change. 

Packet transmission or ATM should not be confused with 
TDM. TDM calls for synchronous division of the overall 
bandwidth of a common backbone into multiple low speed 
channels and assigns a specific time slot to each channel. In 
other words, if there are four channels, each channel is 
allocated a fourth of the bandwidth. The bandwidth is 
systematically switched, such that channel 1 gets its fourth- 
bandwidth, followed by channels 2, 3, 4, 1, 2 and so on. In 
TDM, the processing power necessary to share common 
bandwidth is located in various, centralized multiplexers. 
This centralization is acceptable if channel traffic is constant 
or predictable. However, when traffic occurs in short
intervals (as in the real world), processing becomes nontrivial,
resulting in an effective loss of bandwidth. 

In contrast, packet transmission or ATM is asynchronous, 
allocating the total backbone bandwidth on an as-demanded 
basis. For instance, if channel 1 is highly active, it may 
receive more than its pro-rata share of overall bandwidth. 
When channel 1's activity declines, its allocated bandwidth
likewise declines. Thus, packet transmission or ATM is most 
adept at handling "bursty" transmission of data, wherein the 
activity of each individual channel is subject to relatively 
wide variation. Thus, because computers transmit data 
through networks in packets, computer data are said to be 
"bursty." Unlike TDM, the processing power required to 
create, transmit and receive packets is distributed among all 
of the communicating devices, rather than being centralized. 
Thus, bandwidth is not effectively lost due to inherent 
limitations in centralized processing. 
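
The contrast between fixed TDM time slots and as-demanded allocation can be sketched as follows. This is an illustrative model only: the function names, the four-channel traffic figures and the proportional-sharing rule applied when demand exceeds capacity are assumptions made for the example and are not drawn from any standard.

```python
def tdm_allocation(demands, total_bandwidth):
    """Fixed TDM: every channel owns an equal slot, whether it uses it or not."""
    share = total_bandwidth / len(demands)
    # Traffic actually carried is capped by both the slot size and the demand.
    return [min(d, share) for d in demands]


def demand_allocation(demands, total_bandwidth):
    """As-demanded sharing (assumed rule: proportional when oversubscribed)."""
    total_demand = sum(demands)
    if total_demand <= total_bandwidth:
        return list(demands)  # every channel is fully served
    scale = total_bandwidth / total_demand
    return [d * scale for d in demands]


# Hypothetical traffic: channel 1 is bursting, the others are nearly idle.
demands = [70.0, 5.0, 5.0, 0.0]
tdm = tdm_allocation(demands, 100.0)
dyn = demand_allocation(demands, 100.0)
print(sum(tdm), sum(dyn))
```

Under these assumed figures, the bursting channel is held to its quarter share under TDM while the idle slots go unused, whereas the as-demanded scheme carries all of the offered traffic.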

Although telephone networks have been in place for over 
a century, computer networks have come into being only in 
the past quarter century. In contrast with the dedicated 
channels of traditional telephone networks, computer net- 
works allow individual computers shared access to a com- 
mon communication backbone having relatively broad 
bandwidth (in a manner quite similar to ATM). 

As in ATM, computer data are divided into packets, each 
of which includes error protection. The individual net- 
worked computers ("nodes") thus are granted access to the 
complete bandwidth of the backbone so they can transmit 
their packets of computer data thereon. When the transmit- 
ting computer completes transmission of the packet, the 
backbone is made immediately available for the other com- 
puters. 

A special case of a computer network is a personal 
computer ("PC") network. Whereas PCs were once only
used as isolated devices, they are now used for a wide range 
of applications requiring the PCs to communicate with each 
other over a computer network. 

Today, networking in a large office with hundreds of PCs, 
or in a small office with just a few PCs, is very popular and, 
quite simply, the best way to share data and communicate 
among PCs. A local area network ("LAN") is a specific type 
of network connecting PCs located in relatively close prox- 
imity. A wide area network ("WAN") is a network of 
separate LANs. The backbones of such LANs typically 
comprise coaxial or twisted-pair cable. 

All networks experience delay in end-to-end data transmissions
therethrough. This delay (termed "latency") affects
the overall efficiency and effective bandwidth of the network.
ATM and computer networks, because they are asynchronous,
are further subject to "jitter," defined as the change in
network latency as a function of time. Jitter is largely 
unpredictable; however, the overall quantity of traffic on a 
network tends to increase both latency and jitter. 
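
The definitions of latency and jitter given above can be illustrated with a short calculation. The timestamps below are invented for the example (in milliseconds), and the function names are hypothetical.

```python
def latencies(send_times, receive_times):
    """Per-packet latency: receive time minus send time."""
    return [rx - tx for tx, rx in zip(send_times, receive_times)]


def jitter(latency_series):
    """Jitter as the change in latency from one packet to the next."""
    return [b - a for a, b in zip(latency_series, latency_series[1:])]


# Hypothetical packets sent every 20 ms and received after varying delays.
sent = [0, 20, 40, 60]
received = [5, 32, 47, 70]
lat = latencies(sent, received)
jit = jitter(lat)
print(lat, jit)
```

A network with constant latency would yield a jitter series of all zeros; any nonzero entry indicates that a packet arrived early or late relative to its predecessor.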

At the heart of any computer network is a communication 
protocol. A protocol is a set of conventions or rules that 
govern the transfer of data between computer devices. The 
simplest protocols define only a hardware configuration, 
while more complex protocols define timing, data formats, 
error detection and correction techniques and software struc- 
tures. 

Computer networks almost universally employ multiple 
layers of protocols. A low-level physical layer protocol 
assures the transmission and reception of a data stream 
between two devices. Data packets are constructed in a data 
link layer. Over the physical layer, a network and transport 
layer protocol governs transmission of data through the 
network, thereby ensuring end-to-end reliable data delivery.

The most common physical networking protocol or topology
for small networks is Ethernet, developed by Xerox.
When a node possesses a packet to be transmitted through 
the network, the node monitors the backbone and transmits 
when the backbone becomes clear. There is no central 
backbone master device to grant requests to gain access to 
the backbone. While this type of multipoint topology facili- 
tates rapid transmission of data when the backbone is lightly 
utilized, packet collisions may occur when the backbone is 
heavily utilized. In such circumstances, there is a greater 
chance that multiple nodes will detect that the backbone is 
clear and transmit their packets coincidentally. If packets are 
impaired in a collision, the packets are retransmitted until 
transmission is successful. 

Another conventional physical protocol or topology is 
Token Ring, developed by IBM. This topology employs a 
"token" that is passed unidirectionally from node to node 
around an annular backbone. The node possessing the token 
is granted exclusive access to the backbone for a single 
packet transfer. While this topology reduces data collisions, 
the latency incurred while each node waits for the token 
translates into a slower data transmission rate than Ethernet 
when the network is lightly utilized. 

Several network and transport protocols designed to 
handle bursty data transmission are well known in the art. 
One protocol that enables communication between PCs is 
the Microcom Networking Protocol ("MNP"), developed by 
Microcom Systems. MNP is suited for both interactive 
communication and file transfers and may be implemented 
on a wide range of computers. MNP packets data with a 
header and trailer containing packet type, CRC and other 
information concerning the packet. While the MNP protocol
provides relatively error-free transmission of data, the sig- 
nificant overhead of the header and trailer decreases data 
bandwidth. 

The prior art includes many techniques involving manipu- 
lation of data to boost the data transmission rate or "through- 
put" of a network. U.S. Pat. No. 4,691,314, assigned to 
Microcom, discloses a system for transmitting data in larger, 
adjustable-sized packets. Because the system allows for 
larger packets, relatively less header and trailer overhead is 
required. 

However, when the transmission medium is unreliable 
(such as when the data are transmitted over noisy telephone 
network lines), errors may occur more frequently in the data.
As packet length increases, the chance of corruption of the 
data within the packet also increases. Furthermore, the larger 
packets must be retransmitted, thereby decreasing network 
throughput.
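
The relationship between packet length and the chance of corruption can be made concrete with a simple model. The sketch below assumes that bit errors are independent and occur at a fixed rate, which is a simplification (errors on a noisy line are often clustered); the function name and the figures are hypothetical.

```python
def packet_error_probability(bit_error_rate, packet_bits):
    """Probability that at least one bit of the packet is corrupted,
    assuming independent bit errors at a fixed rate."""
    return 1.0 - (1.0 - bit_error_rate) ** packet_bits


# Hypothetical line with one bit error per 100,000 bits on average.
small = packet_error_probability(1e-5, 800)    # 100-byte packet
large = packet_error_probability(1e-5, 8000)   # 1000-byte packet
print(small, large)
```

Under this model the larger packet is both more likely to be corrupted and more costly to retransmit, which is the throughput penalty described above.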

Another network and transport protocol is Transmission 
Control Protocol/Internet Protocol ("TCP/IP"). This proto- 
col employs a "go back N method" of error and flow control 
over a datagram network. In a "go back N method" of error 
control, if there is a transmission error, a packet loss, 
excessive latency in the delivery of a packet, delivery of a 
packet out of sequence or an overflow of a receiver buffer, 
significant loss of throughput is realized due to excessive 
packet retransmissions. 

As the domain of digital computer networks continues to 
expand, the networks are challenged with new and more 
difficult responsibilities. One of those challenges is multi- 
media. In recent years, there have been a number of attempts 
to produce a digital data network additionally capable of 
carrying data representing a digitized audio signal (again, 
"voice"), thereby additionally functioning as a telephone 
network and, in sum, yielding a so-called "multimedia 
network." 

As described above, however, audio signals are extremely 
time-sensitive, because users are extremely sensitive to 
minute tones, inflections and pauses, particularly in human 
speech. Thus, a computer data network that also must 
transmit audio data is forced to cope with the communica- 
tion of both bursty computer and time-sensitive audio data 
on the backbone. 

The repercussion is that the above-described data network 
and transport protocols that are sufficient to transmit data are 
insufficient for transmission of time-sensitive audio data. 
The latencies present in a communication network, e.g., 
those relating to coding, packet assembly, media access, 
propagation, receiver buffering and decoding, must be pre- 
cisely compensated for to preserve the fidelity of the audio 
signal. 

At this point, an interesting observation should be made. 
Data has been described above as being bursty. It has been 
implied that audio data is somehow not. Both of these 
assumptions prove to be inaccurate. First, data is only bursty 
because computer networks have been dealing with it in that 
manner for so many years. In fact, once transmission of a 
batch of data begins, the data transmission rate is constant.
Second, because spoken words are made of small, discrete 
utterances (syllables or words), audio data is inherently 
bursty. Therefore, while it is certainly true that audio data is 
extremely time-sensitive, audio data is likewise bursty. If a 
way can be found to compensate for network jitter, audio 
data should be highly amenable to packet-based transmis- 
sion. 

Therefore, what is needed in the art is a system and 
method for transmitting and receiving digitized audio data in 
a packet-based network to adjust for variable packet trans- 
mission times. The system and method must deliver end- 
to-end reliable transmission of data, accounting for all 
delays in the transmission network while presenting high 
fidelity audio signals at the receiving end. 

SUMMARY OF THE INVENTION 

To address the above-discussed deficiencies of the prior 
art, it is a primary object of the present invention to 
compensate for jitter in a computer network to provide high 
fidelity transmission of audio data through the network. 

In the attainment of the above primary object, the present
invention provides a system and method for communicating 
audio data in a packet-based computer network wherein 
transmission of data packets through the computer network 
requires variable periods of transmission time. The system
comprises a packet assembly circuit for constructing a data 
packet from a portion of a stream of digital audio data 
corresponding to an audio signal. The packet assembly 
circuit generates a position identifier indicating a temporal 
position of the portion relative to the stream, inserting the
position identifier into the data packet and queuing the data 
packet for transmission through a backbone of the computer 
network. 

The system further comprises a packet disassembly circuit,
having a buffer associated therewith, for receiving the
data packet from the backbone. The packet disassembly 
circuit inserts the portion into an absolute location of the 
buffer, the position identifier determining the location, the 
portion thereby synchronized with adjacent portions of the 
stream of digital audio data in the buffer to compensate for
the variable periods of transmission time. 

Transmission of audio data over a computer network is a 
more exacting task than transmission of less time-sensitive 
computer data. As previously described, audio data are 
extremely time sensitive; and as a result, the system
hardware, software and transport protocol must be precisely
coordinated to realign the audio data at the receiving end. 
The present invention provides such a system and method 
for ensuring high fidelity and clear transmission of audio 
data through a computer network.

The position identifier of the present invention should not 
be confused with a packet sequence number. As will be 
described in more detail, the position identifier points to a 
specific, absolute address in the buffer and not to a position 
of the packet relative to other packets. With sequence
numbers, one may only discern that packet 3 follows packet 
2 and precedes packet 4. With the position identifier, one 
may further discern vital packet synchronization informa- 
tion: that packet 3 follows packet 2 by, e.g., 5 milliseconds 
("ms") and precedes packet 4 by, e.g., 15 ms. In distinct
contrast to sequence numbers, position identifiers may cause 
portions of packets to occlude (and therefore overwrite) 
portions of other packets, may result in temporal gaps 
between packets (resulting in interstitial periods of silence) 
and allow packets to be transmitted in an arbitrary order
without compromising relative packet synchronization. 
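
The behavior described above (arbitrary arrival order, interstitial silence and occlusion) can be sketched with a buffer in which each portion is written at the absolute offset named by its position identifier. This is a minimal illustration, not the disclosed circuit: the function name, the use of a Python list as the buffer, the zero value standing for silence and the wrap-around (circular) addressing are all assumptions.

```python
SILENCE = 0  # assumed representation of a silent sample


def place_portions(packets, buffer_length):
    """Write each portion at the absolute buffer offset given by its
    position identifier, regardless of the order of arrival."""
    buffer = [SILENCE] * buffer_length
    for position, samples in packets:
        for i, sample in enumerate(samples):
            # Wrap-around addressing is an assumption of this sketch.
            buffer[(position + i) % buffer_length] = sample
    return buffer


# Hypothetical packets arriving out of order: (position identifier, samples).
arrived = [(8, [7, 7]), (0, [1, 1, 1]), (2, [2, 2, 2])]
result = place_portions(arrived, 12)
print(result)
```

In this example the portion at position 2 occludes the last sample of the portion at position 0, offsets 5 through 7 remain silent because no packet addresses them, and the portion at position 8 is correctly placed even though it arrived first.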

In a preferred embodiment of the present invention, the 
system further comprises an interpolation circuit for inserting
synthesized audio data into a designated location of the
buffer to thereby lengthen the portions of the stream of audio 
data in the buffer. The interpolation circuit addresses those 
circumstances in which the length of the buffer decreases 
during reception of audio data from the backbone. This 
happens when data are read from the buffer faster than they
are written to the buffer. 

For example, if the clock of a coder/decoder ("CODEC")
that reads from the buffer is too fast, the CODEC reads too
rapidly and the buffer becomes too short. The interpolation
circuit is adapted to detect when the buffer is too short and
adjust the buffer toward a predetermined length by adding
the synthesized audio data. The interpolation circuit ensures
that the buffer stays close to its predetermined length for
efficient realignment of the audio data in the buffer.
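
One way to picture the lengthening operation is shown below. The sketch assumes that the synthesized datum is the integer average of two neighboring samples and that the designated location is the midpoint of the buffered data; neither choice is stated in the text, and the function name is hypothetical.

```python
def interpolate(window, target_length):
    """Lengthen the buffered samples toward target_length by inserting
    synthesized samples (assumed: midpoint average of two neighbors)."""
    out = list(window)
    while 2 <= len(out) < target_length:
        mid = len(out) // 2
        synthesized = (out[mid - 1] + out[mid]) // 2
        out.insert(mid, synthesized)
    return out


# Hypothetical buffer that has shrunk to four samples; target is six.
lengthened = interpolate([0, 10, 20, 30], 6)
print(lengthened)
```

Because each inserted value lies between its neighbors, the added samples lengthen the buffered audio without introducing an abrupt discontinuity.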

The system of the present invention further comprises a
decimation circuit for deleting audio data from a designated
location of the buffer to thereby shorten the portions of the
stream of audio data in the buffer. The decimation circuit 
addresses the circumstance in which the length of the buffer 
increases during reception of audio data from the backbone. 
This happens when data are read from the buffer slower than 
they are written to the buffer. 

For example, if the CODEC clock triggers too slowly, or 
if the audio data are transmitted at an excessive rate through 
the LAN, the buffer window lengthens. The decimation 
circuit is adapted to detect when the buffer is too long and 
adjust the buffer toward its predetermined length by deleting
selected audio data. Like the interpolation circuit, the
decimation circuit ensures that the buffer stays close to its
predetermined length for efficient realignment of the audio data in
the buffer. 
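
The shortening operation can be sketched in the same spirit. The example assumes that the sample deleted on each pass is the one at the midpoint of the buffered data; the text says only that selected audio data are deleted from a designated location, so the selection rule and the function name are hypothetical.

```python
def decimate(window, target_length):
    """Shorten the buffered samples toward target_length by deleting
    samples (assumed: the sample at the midpoint on each pass)."""
    out = list(window)
    while len(out) > max(target_length, 0):
        del out[len(out) // 2]
    return out


# Hypothetical buffer that has grown to six samples; target is four.
shortened = decimate([0, 10, 20, 30, 40, 50], 4)
print(shortened)
```

A practical implementation would likely choose the deleted samples so as to minimize audible artifacts; the midpoint rule here is only the simplest choice that demonstrates the length adjustment.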

In a preferred embodiment of the present invention, the 
data packet of the present invention comprises source and 
destination fields for determining a transmission route of the 
data packet through the computer network. This embodi- 
ment is primarily directed to an Ethernet environment, 
wherein each node in the computer network is designated by 
a specific address. Prior to routing the audio data across the 
backbone of the computer network, the data packet is 
assigned a source and destination address designating the 
appropriate nodes. Alternatively, a channel identifier may be 
used in WAN applications (via ATM) to ensure accurate 
delivery. 
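
A data packet carrying destination and source fields together with a position identifier might be laid out as in the following sketch. Six-byte addresses match Ethernet node addresses; the 4-byte position identifier, the 2-byte length field, the field order and the function names are assumptions made for illustration and are not the packet format of the drawings.

```python
import struct

# Hypothetical header: 6-byte destination, 6-byte source, 4-byte position
# identifier, 2-byte portion length, all in network byte order.
HEADER = struct.Struct("!6s6sIH")


def assemble(dest, src, position, portion):
    """Prefix the audio portion with the addressing and position fields."""
    return HEADER.pack(dest, src, position, len(portion)) + portion


def disassemble(packet):
    """Recover the addressing fields, position identifier and portion."""
    dest, src, position, length = HEADER.unpack_from(packet)
    portion = packet[HEADER.size:HEADER.size + length]
    return dest, src, position, portion


pkt = assemble(b"\x01" * 6, b"\x02" * 6, 160, b"abcd")
print(disassemble(pkt))
```

The receiving side needs only the position identifier and the portion to place the audio data; the address fields serve routing alone.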

As previously described, packet-based transmission 
allows advantageous distributed call processing and signal- 
ling. Thus, each packet assembly circuit is individually 
responsible for determining the routing of the audio data 
through the network. 

In a preferred embodiment of the present invention, a 
value of the position identifier is a function of a length of a 
portion of the stream of digital audio data in a previously-
transmitted data packet. Thus, the position identifier preferably
designates the position at which the first datum of each
portion is to be placed in the buffer. That position preferably
follows the position of the last datum of the previously-
transmitted data packet.
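
The dependence of each position identifier on the length of the previously transmitted portion can be sketched on the transmitting side as a running offset. The function name and the starting position of zero are assumptions; the portions are arbitrary byte strings chosen for the example.

```python
def assign_positions(portions, start=0):
    """Tag each portion with the position at which its first datum belongs,
    advancing by the length of the portion just tagged."""
    position = start
    tagged = []
    for portion in portions:
        tagged.append((position, portion))
        position += len(portion)
    return tagged


# Hypothetical portions of unequal length.
tagged = assign_positions([b"abc", b"de", b"fghi"])
print(tagged)
```

Because the identifier advances by the actual length of each portion, portions of unequal length still land contiguously in the receiving buffer.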

In a preferred embodiment of the present invention, each 
portion of audio data (a "sample") is placed in a data packet 
having a prescribed length. In addition to the sample, the 
data packet contains a position identifier. The position
identifier directs the samples into absolute positions in the
buffer, which may or may not be successive. The distinct
advantage of the position identifier is temporal synchroni- 
zation of samples in the buffer. 

It should also be understood that data other than audio data can
occupy the data packet. Given a special header designation,
signalling and call processing (control) data can be loaded
into a packet. Again, this allows for distributed, decentralized
processing. Once loaded into a packet, the control data
is treated no differently than audio data in its travels through 
the network. 
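
Separating control data from audio data at the receiving end by means of a header designation might look like the sketch below. The one-byte type codes, their values and the function name are hypothetical; the text states only that a special header designation distinguishes control packets.

```python
AUDIO, CONTROL = 0x01, 0x02  # hypothetical header designations


def dispatch(packet, audio_sink, control_sink):
    """Route a received packet by its leading designation byte."""
    kind, payload = packet[0], packet[1:]
    if kind == AUDIO:
        audio_sink.append(payload)
    elif kind == CONTROL:
        control_sink.append(payload)
    else:
        raise ValueError(f"unknown header designation: {kind:#x}")


audio, control = [], []
dispatch(bytes([AUDIO]) + b"pcm", audio, control)
dispatch(bytes([CONTROL]) + b"ring", audio, control)
print(audio, control)
```

In transit both kinds of packet are handled identically; only the endpoint inspects the designation.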

In a preferred embodiment of the present invention, a 
length of a travelling window within the buffer of the present 
invention is about 20 ms. The window is defined as the 
difference between the locations at which data are written to 
and read from the buffer. The window is established at that 
optimal length (in an Ethernet application) as a function of 
packet length and network characteristics (such as latency in 
packet assembly, media access, transmission and disassem- 
bly). In an ATM network, window length should also be 
about 20 ms. With the Internet, window length should be 
about 50-100 ms to account for significant latency in that 
very large network. In each case, if the window were to be 
shorter, there may not be sufficient time to allow for the
latency. Echo cancellation is typically a requirement when 
the round trip audio delay exceeds 60 ms. 
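
The window length, defined above as the difference between the write and read locations, can be converted to milliseconds as follows. The 8 kHz sample rate, the circular buffer of 1024 samples and the function name are assumptions made for the example; they are not values given in the text.

```python
SAMPLE_RATE_HZ = 8000  # assumed telephone-quality sampling rate


def window_ms(write_index, read_index, buffer_length,
              sample_rate_hz=SAMPLE_RATE_HZ):
    """Length of the travelling window in milliseconds, measured from the
    read location forward to the write location in a circular buffer."""
    samples = (write_index - read_index) % buffer_length
    return samples * 1000 / sample_rate_hz


# The writer is 160 samples ahead of the reader in both cases, the second
# with the write index having wrapped past the end of the buffer.
print(window_ms(160, 0, 1024), window_ms(60, 924, 1024))
```

At the assumed rate, the 20 ms window of the Ethernet and ATM cases corresponds to 160 samples, and a 50-100 ms window for the Internet corresponds to 400-800 samples.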

In a preferred embodiment of the present invention, the
data packet is capable of containing a portion having a
length of [...] corresponding to [...] bytes, for example.
Again, [...] dependent upon network characteristics [...].
Since many compression algorithms are variable, [...].

In a preferred embodiment of the present invention, the 
system further comprises a digital conversion/compression 
circuit, coupled to the packet assembly circuit, for digitizing 
and compressing the audio signal into the stream of digital 
audio data. Again, many compression algorithms are vari- 
able, so there is not a linear correspondence between uncom- 
pressed and compressed data length. 

The digital conversion/compression circuit converts the 
analog audio signal into a stream of digital audio data for use 
by the packet assembly circuit. The packet assembly circuit 
arranges the audio data into data packets for transmission 
across the backbone. The advantage of digitizing and com- 
pressing the data is that larger effective bandwidth is thereby 
available for transporting audio data through the computer
network. 

In a preferred embodiment of the present invention, the 
system further comprises a decompression/analog conver- 
sion circuit, coupled to the packet disassembly circuit, for 
decompressing and converting the stream of digital audio 
data back into the audio signal. Thus, the received audio data 
are converted into a medium that the listener on the receiv- 
ing end can understand and respond to in kind. 

In a preferred embodiment of the present invention, the 
computer network of the present invention comprises a 
plurality of computers coupled to the backbone, the packet 
assembly circuit and the packet disassembly circuit located 
in separate ones of the computers. Thus, the present invention is
designed to operate in a computer network having a plurality 
of nodes and able to support many ongoing telephone 
conversations. The computer network may be of a client- 
server or peer-peer topology. Thus, the system of the present 
invention allows a computer network to supplant a private 
branch exchange ("PBX") system. PBXs are highly propri- 
etary, expensive and relatively inflexible. 

In a preferred embodiment of the present invention, the 
packet assembly circuit and the packet disassembly circuit 
are embodied in preprogrammed general purpose data pro- 
cessing and storage circuitry. Those of skill in the art will 
recognize that, while the system of the present invention 
may be embodied in discrete circuitry, microprocessor- 
based integrated circuits provide an attractive and flexible 
environment for embodiment of the system. 

The foregoing has outlined rather broadly the features and 
technical advantages of the present invention so that those 
skilled in the art may better understand the detailed descrip- 
tion of the invention that follows. Additional features and 
advantages of the invention will be described hereinafter 
that form the subject of the claims of the invention. Those 
skilled in the art should appreciate that they may readily use 
the conception and the specific embodiment disclosed as a 
basis for modifying or designing other structures for carry- 
ing out the same purposes of the present invention. Those 
skilled in the art should also realize that such equivalent 
assemblies do not depart from the spirit and scope of the 
invention in its broadest form. 

For a more complete understanding of the present invention,
and the advantages thereof, reference is now made to
the following descriptions taken in conjunction with the 
accompanying drawings, in which: 

FIG. 1 illustrates a computer network that forms an 
environment within which the present invention can operate; 

FIG. 2 illustrates a block diagram of a microprocessor- 
based system constructed in accordance with the present 
invention; 

FIG. 3 illustrates an Ethernet data packet of audio data 
assembled according to the present invention; 

FIG. 4 illustrates an ATM data cell of audio data 
assembled according to the present invention; 

FIG. 5 illustrates the operation of the buffer of the present 
invention; 

FIG. 6 illustrates a flow diagram of the method of 
assembling a data packet according to the present invention; 
and 

FIG. 7 illustrates a flow diagram of the method of 
disassembling a data packet according to the present inven- 
tion. 


DETAILED DESCRIPTION 

Referring initially to FIG. 1, illustrated is a computer 
network, generally designated 100, that forms an environ- 
ment within which the present invention can operate. The 
network 100 is illustrated as including a telephone instru- 
ment 110 coupled, via a PC 120 having a display screen 124, 
to an Ethernet-type computer network backbone 130. Other 
telephone instruments 112, 114 may be coupled to the 
backbone 130 via a multiple station card 122. The present 
invention is capable of transmitting audio signals among the 
telephone instruments 110, 112, 114 via the Ethernet back- 
bone 130. 

The present invention is compatible with various physical 
layer protocols. The Ethernet backbone 130 is linked 
through an Ethernet switch 140 and an ATM hub 150 to a
Token Ring backbone 172 of a Token Ring LAN 170. The 
Token Ring backbone 172 is coupled, via a PC 176 having 
a display screen 178, to a telephone instrument 174. The 
ATM hub 150 is coupled, via a PC 154 having a display screen
156, to a telephone instrument 154. Packetized computer 
data transmitted across the Ethernet backbone 130 is 
switched through the Ethernet switch 140 to the ATM hub 
150. Packetized computer data transmitted across the Token 
Ring backbone 172 is routed directly through the ATM hub 
150. Again, the present invention is fully ATM-compatible, 
thereby allowing full access to ATM resources via the ATM 
hub 150. 

A telephone server 160 is connected to a plurality of 
telephone instruments 162, 164 and connected, via the 
Ethernet switch 140, to the Ethernet backbone 130. The
telephone server 160 is also connected through the ATM hub 
150. Audio data from the Ethernet backbone 130 is directed 
through the telephone server 160, via the Ethernet switch 
140, to the ATM hub 150. The telephone server 160 provides 
full ISDN communication to central office ("CO") trunk 
lines 166, thereby allowing WAN via ATM. 

Again, the present invention provides a system and 
method for communicating audio data in the packet-based 
computer network 100 wherein transmission of data packets 
through the computer network 100 requires variable periods 
of transmission time. The present invention is designed to
operate in a distributed architecture network 100 with com- 
ponents as herein described. 

The telephone instruments 110, 112, 114, 162, 154, 164, 
174 may be traditional analog instruments, but it is within
the scope of the present invention that they be ISDN- 
compatible or other digital instruments. The PCs 120, 154, 
174 are illustrated as being conventional PCs having an 
expansion or input/output ("I/O") bus preferably adhering to 
the Industry Standard Architecture ("ISA") or Extended
Industry-Standard Architecture ("EISA"). Those of skill in 
the art will understand that the present invention is not 
limited to a particular hardware architecture. As will be 
described with reference to FIG. 2, the I/O bus provides an 
interface by which the system of the present invention
allows communication between the backbones 130, 170 and 
the hub 150 and the corresponding PCs 120, 154, 174. 

The PC 120 includes a display screen 124 that is capable 
of displaying, under software control, data pertaining to 
operation of the system. This allows a user to use the display
screen 124 for visual access to phone features through 
processing and interface capabilities, such as those provided 
in Telephony Application Programmers Interface ("TAPI"),
developed by Intel and Microsoft or Telephony Services
Application Programmers Interface ("TSAPI"), developed
by Novell and AT&T. The backbone 130 is a conventional 
Ethernet backbone comprising multiple parallel conductors 
that act as paths along which data are transferred among 
nodes of the computer network 100. 

The ATM hub 150 is an interface card that converts
Ethernet or Token Ring packet formats to ATM cell formats.
The Ethernet packet to ATM cell conversion is discussed in
reference to FIG. 4. The ATM hub 150 provides the
previously-described interface between the Ethernet or Token
Ring network and an ATM-switched network.

In the illustrated embodiment, the telephone server 160 
multiplexes signals from dedicated telephones 162, 164 and 
audio data from the backbone 130 of the Ethernet physical 
protocol layer, thereby providing digital service of audio 
data.

Turning now to FIG. 2, illustrated is a block diagram of
a microprocessor-based system constructed in accordance
with the present invention. The microprocessor-based
controller comprises a microprocessor 210, a digital signal
processor ("DSP") 220, a CODEC 230, a telephone set
interface ("TSI") 240, a TSI connector 242, random-access
memory ("RAM") 250, an Ethernet controller 260, an
Ethernet controller interface connector 262, a dual port
memory 270, and a dual port memory interface connector
272.

The illustrated embodiment provides standard telephone
instrument 110 connectivity into the PC 120 through the TSI
240 and TSI connector 242. The TSI 240 accepts an analog
signal from the telephone instrument 110. The TSI connector
242 is preferably a standard RJ-11 connector.

The illustrated embodiment also provides connectivity to 
the backbone 130 through the Ethernet controller 260 and 
Ethernet controller interface connector 262. The Ethernet 
controller 260 transmits data to, and receives data from, the
backbone 130. The Ethernet controller interface connector 
262 is preferably a standard RJ-45 connector. The Ethernet 
controller 260 is internally connected to the processor 210 
and RAM 250 by an internal local bus 265. 

The TSI 240 is coupled to the CODEC 230. The CODEC
230 provides the analog-to-digital and digital-to-analog
conversion for the audio data. The CODEC 230 comprises a
digital conversion/compression circuit for digitizing and
compressing the audio signal into the stream of digital audio 
data. Those of ordinary skill in the art should understand that 
the present invention does not depend upon application of a 
particular compression/decompression algorithm, or upon 
whether the data are even compressed at all. The sampling 
and compression schemes described herein are for illustra- 
tion only. 

When the telephone instrument 110 transmits an analog 
audio signal to the CODEC 230, the CODEC 230 samples 
the signal at a predetermined, conventional rate of 8 kHz.
The CODEC 230 then preferably employs a known, stan- 
dard logarithmic compression method (such as A-Law or 
μ-Law) to compress a 13 or 14 bit wide data sample into an
8 bit compressed sample. The CODEC 230 further com- 
prises a decompression/analog conversion circuit for 
decompressing and converting the stream of digital audio 
data back into the audio signal. The decompression circuit 
restores the 8 bit compressed sample into a decompressed 13 
or 14 bit sample and converts the sample into an analog 
voltage for reproduction in the telephone instrument 110. 
Finally, the CODEC 230 has an associated clock (not 
illustrated) that governs the pace of the CODEC's operation.
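
By way of illustration only, the logarithmic compression and
decompression described above may be sketched as follows. The sketch
uses the continuous μ-law curve rather than the bit-exact segment
encoding a real CODEC would apply, and the function names and the
14 bit input range are assumptions made for the example.

```python
import math

MU = 255            # mu-law constant commonly used for 8 bit companding
FULL_SCALE = 8191   # largest magnitude of a signed 14 bit sample (assumed width)


def compress(sample: int) -> int:
    """Map a signed 14 bit linear sample to an 8 bit code in 0..255."""
    x = max(-FULL_SCALE, min(FULL_SCALE, sample)) / FULL_SCALE
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1.0) * 127.5))


def expand(code: int) -> int:
    """Map an 8 bit code back to an approximate 14 bit linear sample."""
    y = code / 127.5 - 1.0
    x = math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
    return int(round(x * FULL_SCALE))
```

Because the curve is logarithmic, the round-trip error stays small for
quiet samples and grows roughly in proportion to amplitude for loud
ones, which is the property that lets 8 bits stand in for 13 or 14.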

The DSP 220 analyzes, filters and enhances audio data 
from the CODEC 230. The DSP 220 may also provide echo 
cancellation or compression/decompression in lieu of the 
CODEC 230. Echo cancellation is typically a requirement 
when the round trip audio delay exceeds 60 ms. 

The processor 210 is charged with the responsibility of 
compiling the information from the DSP 220 and Ethernet 
controller 260 and performing the operations required to 
transmit the data. The processor 210 therefore embodies the 
packet assembly circuit and the packet disassembly circuit. 
As stated above, the packet assembly circuit generates a 
position identifier 370 that indicates a temporal position of 
the portion relative to the stream, inserts the position iden- 
tifier 370 into the data packet and queues the data packet in 
the Ethernet controller for transmission through the Ethernet 
backbone 130. 

The RAM 250 preferably contains a receiving buffer 510 
according to the present invention. It will be recalled that the 
receiving buffer 510 is associated with the packet disassem- 
bly circuit and provides the environment within which 
portions of audio data are reassembled.

The processor 210 further embodies an interpolation 
circuit for inserting synthesized audio data into a designated 
location of the receiving buffer 510 to thereby lengthen the 
portions of the stream of audio data in the receiving buffer 
510 and a decimation circuit for deleting audio data from a 
designated location of the receiving buffer 510 to thereby 
shorten the portions of the stream of audio data in the 
receiving buffer 510. 

Access between the dual port memory 270 and the I/O bus 
280 of the PC 120 is provided through the dual port memory 
connector 272. The dual port memory 270 provides storage 
capacity and overflow back-up in facilitating communica- 
tion between the internal local bus 265 and the I/O bus 280. 
Digital data from the Ethernet controller 260 and the pro- 
cessor 210 can be stored in the dual port memory 270. 

At this point, it should be stated that the present invention 
is ultimately directed to application in an ATM environment.
It has been stated previously that ATM does not currently 
enjoy wide acceptance. However, this is changing. Thus, 
with respect to the embodiments disclosed herein, a two-part 
description will be undertaken. In FIG. 3, the present inven- 
tion will be described as applied in the currently-popular 
Ethernet environment. In FIG. 4, the present invention will
be described as applied in ATM, its eventual preferred
environment.

Turning now to FIG. 3, illustrated is an Ethernet data
packet of audio data assembled according to the present
invention. The preferred embodiment demonstrates the
compatibility of the present invention with an Ethernet II frame
having a total length of 74 bytes. A frame size of 72
bytes is the minimum sized frame allowed by Ethernet.
Illustrated are an Ethernet II header 310, a message 330 and
an Ethernet II trailer 390.

The Ethernet II header 310 comprises an Ethernet pre- 
amble 313, an Ethernet Start Frame Delimiter ("SFD") 316, 
a destination address 319, a source address 322 and a type 
field 325. The Ethernet preamble 313 is a 7 byte series that
provides timing synchronization for the receivers. The
Ethernet SFD 316 is a 1 byte address that separates data at the
input of the computer. The type field 325 denotes the
upper-layer protocol that is using the data packet.

The Ethernet II header 310 further comprises the
destination address 319 and source address 322 for determining
a transmission route of the data packet through the computer
network. Prior to transmitting the audio data across the
backbone 130 of the computer network 100 of FIG. 1, the
data packet is assigned the destination address 319 and
source address 322. Each individual node in the computer
network is designated by a specific address. To ensure that
each individual data packet is routed to the proper
destination, the Ethernet II header 310 of each data packet is
assigned a respective destination address 319 and source
address 322. Consequently, the data travels between
respective locations.

In particular, the destination address 319 identifies the
station in the computer network to which the data packet will
be sent. The source address 322 is the address of the station
in the computer network that sent the data packet. Both the
destination address 319 and the source address 322 are 6
bytes long.

The Ethernet II trailer 390 comprises a Frame Check
Sequence ("FCS") field 395. The FCS field 395 is an
error-checking device built into each data packet to ensure
that only valid frames are processed by the receiving station.
The FCS field 395 contains a 4 byte CRC value. A CRC
calculation is performed by the transmitting station before
sending the data packet. The receiving station performs the
same CRC calculation, matching the resulting value against
the contents of the FCS field. If the numbers match, the data
packet is assumed to be valid; if not, the packet is
disregarded.
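
The compute-then-compare use of the FCS field may be sketched as
follows, by way of example only. The sketch relies on `zlib.crc32`,
which uses the same CRC-32 generator polynomial as the Ethernet FCS.
The function names, the little-endian placement of the four CRC bytes
and the treatment of the whole input as the covered region are
simplifying assumptions; an actual controller computes the FCS in
hardware over the address, type and data fields only.

```python
import zlib


def append_fcs(frame_without_fcs: bytes) -> bytes:
    """Transmit side: append a 4 byte CRC-32 computed over the frame contents."""
    crc = zlib.crc32(frame_without_fcs) & 0xFFFFFFFF
    return frame_without_fcs + crc.to_bytes(4, "little")


def fcs_is_valid(frame: bytes) -> bool:
    """Receive side: recompute the CRC and compare it with the trailing 4 bytes."""
    body, fcs = frame[:-4], frame[-4:]
    return (zlib.crc32(body) & 0xFFFFFFFF).to_bytes(4, "little") == fcs
```

A frame for which `fcs_is_valid` returns false would simply be
disregarded by the receiving station.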

The message 330 of the data packet has a maximum
length of 48 bytes. The message 330 comprises a
reserved/length field 340 (optional, and employed with
variable-length audio data packets), a reserved/CRC field
350, a channel identifier 360, a position identifier 370, and
an audio data sample 380.

The reserved/length field 340 is 1 byte long and specifies
the number of bytes contained between the reserved/length
field 340 and the last byte in the audio data sample 380. The
reserved/CRC field 350 is a 1 byte field reserved for error
checking purposes in an ATM cell. The channel identifier
360 is a 1 byte field that identifies the message 330 as a
packet of control data (perhaps containing signalling
commands) if the channel identifier 360 is equal to 255; otherwise
it represents the audio data of a specific station. The channel
identifier allows multiple voice connections on a single real
channel to save switching complexity within the data
network. It also allows voice conferencing on shared media
without additional dedicated bandwidth. The channel
identifier is also used in a call setup sequence to allow multiple
conversations between two voice server devices, and is
thereby suitable for ATM transport.

The position identifier 370 is a pointer representing the
newest audio sample 380. The position identifier 370 is a 1
byte long pointer to 4 byte words of the audio sample 380
and can represent 256 x 4 bytes (1 kilobyte) before it
overflows and wraps. Since digitized audio typically uses a
standard 8 kHz sampling rate (125 microseconds between
samples), 256 x 4 x 125 microseconds (128 ms) is the total
time that the position identifier 370 can represent before
wrapping. The position identifier 370 is used both when the
channel identifier 360 represents audio data and when the
channel identifier 360 represents control data (such as
signalling or call processing). For example, when the channel
identifier 360 equals 255 then the position identifier 370 is
used to represent a signalling data message type.
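
The arithmetic of the position identifier may be illustrated with a
short sketch. The constant and function names are hypothetical, and
`next_identifier` assumes that successive packets are placed end to
end, which the description permits but does not require.

```python
WORD_BYTES = 4          # the identifier counts 4 byte words
PI_VALUES = 256         # a 1 byte identifier has 256 distinct values
SAMPLE_PERIOD_US = 125  # 8 kHz sampling, one byte per sample


def byte_offset(position_identifier: int) -> int:
    """Byte address in the receiving buffer that the identifier points to."""
    return (position_identifier % PI_VALUES) * WORD_BYTES


def next_identifier(position_identifier: int, sample_bytes: int) -> int:
    """Identifier for the portion that follows one of the given length."""
    words = -(-sample_bytes // WORD_BYTES)  # round up to whole words
    return (position_identifier + words) % PI_VALUES


WRAP_BYTES = PI_VALUES * WORD_BYTES               # 1024 bytes
WRAP_MS = WRAP_BYTES * SAMPLE_PERIOD_US / 1000    # 128.0 ms
```

For instance, an identifier of 6 addresses byte 24, and a 44 byte
portion advances the identifier by 11 positions.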

Finally, the message 330 of the data packet contains up to 
44 bytes of digitized audio data samples 380. The audio data 
samples 380 contain digitized audio data if the channel 
identifier 360 is a value other than "255." The audio data 
sample 380 contains system commands if the channel iden- 
tifier equals "255." The commands may be, for example, 
information blocks used to set up, take down, forward and 
conference telephone calls. 
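
The layout of the message 330 may be sketched, by way of example
only, as four 1 byte fields followed by up to 44 bytes of payload.
The function names are hypothetical. The length field is taken here
to count every byte that follows it, which is one reading of the
description, and the reserved/CRC byte is left as a caller-supplied
value because its computation is not specified.

```python
import struct

CONTROL_CHANNEL = 255   # channel identifier value that marks control data
MAX_AUDIO_BYTES = 44


def pack_message(channel: int, position: int, payload: bytes, crc: int = 0) -> bytes:
    """Build length, CRC, channel and position fields followed by the payload."""
    if len(payload) > MAX_AUDIO_BYTES:
        raise ValueError("payload exceeds 44 bytes")
    length = 3 + len(payload)  # assumed: bytes after the length field
    return struct.pack("BBBB", length, crc, channel, position) + payload


def unpack_message(message: bytes) -> dict:
    """Split a message back into its fields."""
    length, crc, channel, position = struct.unpack("BBBB", message[:4])
    return {
        "crc": crc,
        "channel": channel,
        "position": position,
        "payload": message[4:1 + length],
        "is_control": channel == CONTROL_CHANNEL,
    }
```

A full 44 byte payload yields the 48 byte maximum message length
stated above.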

The present invention is designed to handle data packets 
of variable-size, to manage variable time transmission of 
data and to increase the throughput efficiency of data across 
the backbone 130 of the computer network. This attribute is 
extremely important to transmitting time-sensitive audio 
data to achieve high audio fidelity. 

Turning now to FIG. 4, illustrated is an ATM data cell of 
audio data assembled according to the present invention. 
The preferred embodiment demonstrates the compatibility 
of the present invention with an ATM cell having a total, 
fixed length of 53 bytes. The cell is characterized by an ATM
header 410 preceding a message (the message 330 of FIG. 
3). 

ATM combines the benefits of both circuit switching and 
cell switching by providing multiple switched virtual circuit 
connections to users through a single access to a network. 
The ATM header 410 contains information specifying the 
virtual path (a Virtual Path Identifier ("VPI") 430) and 
virtual channel (Virtual Channel Identifier ("VCI") 440) of 
the cell. The VPI 430 and VCI 440 together establish a
node-to-node communications channel. Switch routing is 
based on the VPI 430 and VCI 440. The ATM switch 
requires a connection to be established between the incom- 
ing and outgoing virtual channels before information can be 
routed through the switch. The ATM switch then switches 
and routes each individual cell from the incoming multi- 
plexed cell stream to the outgoing multiplexed cell stream 
based upon the virtual channels identified within the ATM 
header 410. In this context, ATM is truly seen as a connec- 
tion-oriented technology. The ATM switch maintains cell 
sequence; and each cell is switched at the cell rate, not the 
channel rate, to accommodate for variable bit rate transmis- 
sions. 

A Cell Loss Priority Field ("CLP") 460 within the ATM 
header 410 establishes priority on the network. There are 
two levels of semantic priority that allow users or network
providers to choose which cells to discard during periods of 
network congestion. The types are defined by a "1" or "0" 
in the CLP 460 within the ATM header 410. During periods 
of congestion, the CLP 460 determines which information
will be discarded or switched through the network. 

The Payload Type Indicator ("PTI") 450 in the ATM
header 410 discriminates between a cell carrying user
information (such as audio data) and a cell carrying service
information (such as control data) in the message field 330.
The Header Error Control field ("HEC") 470 provides error
checking of the ATM header 410.

The Generic Flow Control field ("GFC") 420 of the ATM
header 410 is designed to provide shared public access
similar to the functionality of a Metropolitan Area Network
("MAN"). GFC 420 is used when there is a single user
access point servicing multiple terminal interfaces, such as
those found in a LAN environment. Each terminal must
receive equal access to the network facilities, and the GFC
420 ensures that each terminal will get equal access to the
shared network bandwidth. The GFC 420 will manage the
various LAN topologies and architectures.

The six fields are positioned within the 5 byte ATM header 
410 at address locations as displayed in the illustrated
embodiment. Distinct from an Ethernet data packet, the 
ATM cell transmits information through the network intact 
with no error checking or correction performed on the 
message field 330. The reserved/CRC field 350 is reserved 
to perform error checking on the channel identifier 360, the 
position identifier 370 and the audio data sample 380 in an 
ATM cell at the receiving end. The message field 330 and 
contents therein are as described in relation to the corre- 
sponding portions of the Ethernet data packet previously 
described in conjunction with FIG. 3. Translation between 
an Ethernet data packet and an ATM cell is completed by 
stripping the destination address 319 and source address 322
from the message field 330 and converting the source and
destination addresses 322, 319 to the VPI 430, VCI 440 and
channel identifier 360 associated with the ATM cell.
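
The six header fields and the 53 byte cell may be sketched as
follows. The bit widths (GFC 4, VPI 8, VCI 16, PTI 3, CLP 1, HEC 8)
and the HEC computation (a CRC-8 with generator x^8 + x^2 + x + 1
over the first four bytes, combined with 0x55) are taken from the
standard user-network cell format rather than from this description,
and the function names and zero padding of short messages are
assumptions made for the example.

```python
def hec(first_four: bytes) -> int:
    """CRC-8 over the first four header bytes, XORed with 0x55."""
    crc = 0
    for byte in first_four:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc ^ 0x55


def pack_header(gfc: int, vpi: int, vci: int, pti: int, clp: int) -> bytes:
    """Pack the five routing/control fields and append the HEC byte."""
    word = (((gfc & 0xF) << 28) | ((vpi & 0xFF) << 20) |
            ((vci & 0xFFFF) << 4) | ((pti & 0x7) << 1) | (clp & 0x1))
    first_four = word.to_bytes(4, "big")
    return first_four + bytes([hec(first_four)])


def build_cell(header: bytes, message: bytes) -> bytes:
    """Join a 5 byte header and a message of up to 48 bytes into a 53 byte cell."""
    if len(header) != 5 or len(message) > 48:
        raise ValueError("header must be 5 bytes and message at most 48 bytes")
    return header + message.ljust(48, b"\x00")  # zero padding is an assumption
```

In a translation from an Ethernet data packet, the VPI and VCI values
passed to `pack_header` would be derived from the stripped source and
destination addresses; that mapping is not shown here.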

Turning now to FIG. 5, illustrated is the operation of the
receiving buffer 510 of the present invention. As previously
discussed, the system comprises a packet disassembly
circuit, having the receiving buffer 510 located in the RAM
250 associated therewith, for receiving the audio data
sample 380 from the backbone 130. The packet disassembly
circuit inserts the portion into an absolute location of the
receiving buffer 510, the position identifier 370 determining
the location. The audio data sample 380 is thereby
synchronized with adjacent audio data samples 380 in the
receiving buffer 510 to compensate for the variable periods of
transmission time. The CODEC reads from the receiving buffer,
lagging the audio data samples, as they are inserted, by some
period of time (20 ms in the illustrated embodiment),
thereby creating a travelling window in the receiving buffer
510 of 20 ms delay. Since the receiving buffer is of a
physical finite length (about 1 kilobyte in the preferred
embodiment), the window "wraps around" the addresses of
the receiving buffer 510. Thus, at any given addressable
location within the receiving buffer 510 data are first written
to the location, then read from, then written to again, and so
on. The receiving buffer 510 therefore acts as a fixed-delay
playback buffer.

Again, in the illustrated embodiment, the length of the
window in the receiving buffer 510 is about 20 ms. The
window is software-settable at that value to account for jitter
in the transmission network, and packetizing and
depacketizing delay. The jitter in the network is primarily due to
data traffic congestion. The pre-set length of the window
more than adequately accommodates a data packet and any
inherent system delays in reconstructing the audio data at the
receiving end.




As previously mentioned, the CODEC 230 reads from the 
receiving buffer 510 at a rate ideally equal to that at which 
audio data are added, thereby maintaining window length. 
As data are read, the data are replaced with white noise data, 
representing silence. If the white noise data are not subse- 
quently overwritten with received audio data in a subsequent 
pass through the receiving buffer 510, the CODEC 230 reads 
and decompresses the white noise data instead, producing a 
synthesized near-silence for the benefit of the listener in lieu 
of audio data. 
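
The write-by-position and read-then-refill behaviour of the receiving
buffer 510 may be sketched as follows, by way of example only. The
class name, the 1024 byte size and the single-byte `SILENCE` code
standing in for the white noise data are assumptions; an actual
embodiment might refill with varying noise values.

```python
BUFFER_BYTES = 1024
SILENCE = 0xFF  # placeholder code standing in for white noise (assumed)


class ReceivingBuffer:
    def __init__(self) -> None:
        self.data = bytearray([SILENCE] * BUFFER_BYTES)
        self.read_pos = 0  # CODEC read position, in bytes

    def insert(self, position_identifier: int, audio: bytes) -> None:
        """Write audio at the absolute location named by the identifier."""
        start = (position_identifier * 4) % BUFFER_BYTES
        for i, value in enumerate(audio):
            self.data[(start + i) % BUFFER_BYTES] = value  # wraps around

    def read(self, count: int) -> bytes:
        """Read bytes for playback, refilling each location with silence."""
        out = bytearray()
        for _ in range(count):
            out.append(self.data[self.read_pos])
            self.data[self.read_pos] = SILENCE
            self.read_pos = (self.read_pos + 1) % BUFFER_BYTES
        return bytes(out)
```

Because each portion is written at an address derived from its own
identifier, portions that arrive late or out of order still land in
their proper places, and any location never filled plays back as
silence.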

FIG. 5 specifically illustrates 6 audio data samples 380 of 
various sizes and variable transmission delays being placed 
into the receiving buffer 510 as a function of the position 
identifier 370 contained in each data packet. A value of the 
position identifier 370 may be a function of a length of audio 
data sample 380 in a previously-transmitted data packet but 
is not constrained thereby. The position identifier 370 directs 
each audio data sample 380 into specified absolute positions 
of the receiving buffer 510 at the receiving end. Thus, the 
position identifier 370 is fundamentally different from a 
packet sequence number. 

FIG. 5, in conjunction with the following Table I, illus- 
trates insertion of audio data samples into the receiving 
buffer 510 according to the present invention. 

TABLE I

Audio Data   Audio Data    Delay of                   CODEC    Buffer
Sample       Sample Size   each packet   Position     Read     Length
Number       (bytes)       (ms)          Identifier   Offset   (ms)

1            44            0             29           0        20
2            44            1             40           13       19
3            44            10            51           42       10
4            44            4.5           62           42       15.5
5            24            1.5           73           42       18.5
6            44            0             79           50       20
7            44            0             90           60       20.5


Again, at a sample rate of 8 kHz, individual bytes or
samples occur in 0.125 ms intervals. "Position identifier" 
("PI") locates each temporally successive audio data sample 
380 in an absolute position within the receiving buffer 510. 
The PI is divided by 4, such that a PI of 6 actually points to 
byte 24 in the receiving buffer 510. 

The "CODEC Read Offset" ("CRO") reflects the read 
position with respect to the CODEC in the receiving buffer 
510. Analogous to the PI, the CRO is the actual CODEC 
read position divided by 4, such that a CRO of 1 actually 
points to byte 4 in the receiving buffer 510. In the illustrated 
embodiment, sample 1 contains 44 bytes of data without a 
delay in the system. Thus, CRO₁ is 0 and PI₁ is 29, resulting
in a 20 ms buffer length (14.5 ms plus 5.5 ms of sample 1).
The 44 bytes of audio data sample 380 are placed in the last 
5.5 ms of the receiving buffer 510. 

In sample 2, the system experiences a 1 ms delay. The 44
bytes of audio data sample 380 are placed adjacent to sample
1 with PI₂ equal to 40. Since the audio data sample 380 is
delayed 1 ms, CRO₂ equals 13, equating to a total of 6.5 ms.
Thus, the difference between PI₂ and CRO₂ contracts to a 27
position difference. Adding the 27 position difference
between PI₂ and CRO₂ to the 44 bytes of audio data sample
380 equates to a 19 ms window for sample 2. A 10 ms
system delay is encountered by sample 3, leading to a
contraction of the window to 10 ms. In samples 4 and 5, the
system has compensated for some of the delay and, as a
result, the length of the windows has increased as shown. As
previously discussed, the position identifier 370 represents
an absolute position in the receiving buffer 510 regardless of
the delay in the system. Furthermore, once the transport
medium is free after the extended delay associated with
sample 3, samples 3-5 are immediately positioned in the
receiving buffer 510 one after the other as shown.
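
The buffer lengths in Table I can be reproduced from the position
identifier, the CODEC read offset and the sample size. The following
sketch, with a hypothetical function name, converts the position
difference to bytes (4 bytes per position), adds the sample size and
multiplies by 0.125 ms per byte.

```python
# (sample number, size in bytes, PI, CRO, buffer length in ms) from Table I
TABLE_I = [
    (1, 44, 29, 0, 20.0),
    (2, 44, 40, 13, 19.0),
    (3, 44, 51, 42, 10.0),
    (4, 44, 62, 42, 15.5),
    (5, 24, 73, 42, 18.5),
    (6, 44, 79, 50, 20.0),
    (7, 44, 90, 60, 20.5),
]


def window_ms(pi: int, cro: int, size_bytes: int) -> float:
    """Window length: distance from read position to the end of the newest sample."""
    positions = (pi - cro) % 256          # the 1 byte identifier wraps at 256
    return (positions * 4 + size_bytes) * 0.125


for number, size, pi, cro, expected in TABLE_I:
    assert window_ms(pi, cro, size) == expected, number
```

For sample 2, for example, 27 positions give 108 bytes, and 108 plus
44 bytes at 0.125 ms per byte gives the 19 ms shown.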

Sample 5 further illustrates the circumstance when a
shortened audio data sample 380 is transmitted. Sample 5,
which is only 24 bytes long, is inserted into the receiving
buffer at PI₅ = 73. Since sample 5 is short by 20 bytes, the
missing 20 bytes are filled with white noise, representing
silence. The silence is not shown, as will be explained.

Next, sample 6 arrives. Sample 6 is a full-length packet of
44 bytes. Thus, PI₆ equals 79. Sample 6 overwrites the 20
bytes of silence that had been appended to the end of sample
5. Since FIG. 5 already shows sample 6 in place, the silence
is already overwritten and thus not shown.

Finally, sample 7 displays the circumstance when the
CODEC clock operates too slowly. For purposes of
discussion, the CODEC clock is assumed to be grossly out of
frequency, such that the effect produced thereby is
emphasized. In such a case, the PI advances 5.5 ms or 11 positions
from the previous PI to position 90 in the receiving buffer 510.
However, the slow CODEC clock forces the CRO to lag. In
this instance, the CRO only advances 5.0 ms or 10 positions
from the previous CRO to position 60 in the receiving buffer
510. The result is that the length of the window is 20.5 ms.
Decimation is therefore required to shorten the receiving
buffer 510 to the pre-set size.

Decimation is performed in adjustment intervals as
follows: 1 byte for every 2 bytes away from the ideal window
length (160 bytes, in the illustrated embodiment), 2 bytes for
every 3 or 4 bytes away from the ideal window length and
3 bytes for every 5 or 6 bytes away from the ideal window
length. In this instance, the buffer is 0.5 ms too long,
equating to 4 bytes. Accordingly, the decimation circuit must
remove 2 bytes from the receiving buffer 510 to adjust the
receiving buffer 510 window toward the ideal length.
Interpolation and decimation are ongoing processes in the system
of the present invention.
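
The adjustment schedule may be sketched as a small function, by way
of example only. The sketch rounds half the deviation up, which
reproduces the listed intervals. Treating a 1 byte deviation as
requiring a 1 byte adjustment, capping the adjustment at the largest
listed step of 3 bytes and applying the same schedule in reverse for
interpolation are all assumptions, as is the function name.

```python
import math

IDEAL_WINDOW_BYTES = 160  # 20 ms at 0.125 ms per byte
MAX_STEP_BYTES = 3        # largest step listed in the schedule (cap is assumed)


def adjustment_bytes(window_bytes: int) -> int:
    """Positive: bytes to delete (decimation). Negative: bytes to insert (interpolation)."""
    deviation = window_bytes - IDEAL_WINDOW_BYTES
    step = min(MAX_STEP_BYTES, math.ceil(abs(deviation) / 2))
    return step if deviation > 0 else -step
```

With the 20.5 ms window of sample 7 (164 bytes), the function returns
2, matching the 2 bytes removed in the example above.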

Before leaving FIG. 5, it should be noted that, if window 
length is reduced to zero (either by virtue of the non- 
transmission of periods of silence or by virtue of reception 
of multiple invalid packets), the CODEC 230 simply reads 
the white noise in the receiving buffer 510, thereby
simulating silence, again for the benefit of the listener.

Turning now to FIG. 6, illustrated is a flow diagram of the
method of assembling a data packet according to the present
invention. The packet assembly circuit constructs a data
packet from a portion of a stream of digital audio data
corresponding to an audio signal. As illustrated in the
preferred embodiment, in a step 610, a sample of audio data
is received into the packet assembly circuit. In a decisional
step 620, the packet assembly circuit determines whether the
sample represents silence or nonsilence by comparing the
data therein to a predetermined threshold. If the data have a
value less than the threshold, a packet is not generated, as it
is of little value to occupy network bandwidth transmitting
silence. If the data have a value equalling or exceeding the
threshold, execution proceeds to a step 630, wherein the
packet assembly circuit assigns the reserved/length field
340, the reserved/CRC field 350, the channel identifier 360
and the position identifier 370 to the audio data sample 380.
The previously-described fields appended to the audio data
sample 380 constitute the message 330.

In a step 640 (only applicable in an Ethernet
environment), the Ethernet II header 310 and Ethernet II trailer 390
are affixed to the message 330. The Ethernet II header 310
and Ethernet II trailer 390 contain information necessary to
route the data packet through the computer network and to
check the transmitted data for errors. In an ATM
environment, an ATM header is affixed to the packet.

In a step 650 (again, only applicable in an Ethernet 
environment), the data packet is evaluated for errors. If there 
is an error in the data packet, the process restarts; otherwise
the process moves to a step 660. In the step 660, the data 
packet is queued for transmission across the backbone of the 
network. 
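
Steps 610 through 630 and step 660 may be sketched as follows, by
way of example only. The threshold value, the use of a caller-supplied
peak level as the quantity compared against it, the function name and
the zero placed in the reserved/CRC byte are all hypothetical; the
header and trailer of step 640 and the error evaluation of step 650
are omitted.

```python
from collections import deque

SILENCE_THRESHOLD = 16  # hypothetical threshold on the measured level


def assemble_and_queue(payload: bytes, peak_level: int, channel: int,
                       position: int, queue: deque) -> bool:
    """Queue a message for the portion, or skip it when it is silence."""
    if peak_level < SILENCE_THRESHOLD:
        return False                      # step 620: silence, no packet generated
    length = 3 + len(payload)             # assumed: bytes after the length field
    message = bytes([length, 0, channel, position]) + payload   # step 630
    queue.append(message)                 # step 660: queued for transmission
    return True
```

Because silent portions are never queued, the position identifier of
the next transmitted portion is what tells the receiver where that
portion belongs.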

Turning now to FIG. 7, illustrated is a flow diagram of the 
method of disassembling a data packet according to the 
present invention. In a step 710, if the receiver accepts an 
invalid packet, the packet is disregarded and the disassem- 
bling process for that packet terminates in a step 720. 

In a step 730, assuming the packet is valid, the packet 
disassembly circuit strips the reserved/length field 340, the 
reserved/CRC field 350, the channel identifier 360 and the 
position identifier 370 from the audio data sample 380. In an 
Ethernet environment, the packet disassembly circuit also 
strips the Ethernet II header 310 and Ethernet II trailer 390. 

In a step 740, the packet disassembly circuit inserts the 
audio data sample 380 into an absolute location of the 
receiving buffer 510 (of FIG. 5) according to the value of the 
position identifier 370. The audio data sample 380 is thereby 
synchronized with adjacent audio data samples 380 of the 
stream of digital audio data in the receiving buffer 510 to 
compensate for the variable periods of transmission time. 
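
Steps 710 through 740 may be sketched as a single function operating
on the message 330 after any Ethernet II header and trailer have been
removed. The validity test (a length field consistent with the
message size) and the decision to keep control messages out of the
audio buffer are assumptions, as is the function name.

```python
def disassemble(message: bytes, buffer: bytearray) -> bool:
    """Place the audio of a valid message into the buffer; return False otherwise."""
    if len(message) < 4 or message[0] != len(message) - 1:
        return False                      # steps 710/720: invalid, disregarded
    channel, position = message[2], message[3]
    if channel == 255:
        return False                      # control data, handled elsewhere (assumed)
    audio = message[4:]                   # step 730: fields stripped
    start = (position * 4) % len(buffer)
    for i, value in enumerate(audio):     # step 740: absolute placement
        buffer[(start + i) % len(buffer)] = value
    return True
```

A message carrying position identifier 6, for example, has its audio
written starting at byte 24 of the buffer regardless of when it
arrives.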

From the above, it is apparent that the present invention 
provides a system and method for communicating audio data 
in a packet-based computer network wherein transmission of 
data packets through the computer network requires variable 
periods of transmission time. The system comprises: (1) a 
packet assembly circuit for constructing a data packet from 
a portion of a stream of digital audio data corresponding to 
an audio signal, the packet assembly circuit generating a 
position identifier indicating a temporal position of the 
portion relative to the stream, inserting the position identifier 
into the data packet and queuing the data packet for trans- 
mission through a backbone of the computer network and 
(2) a packet disassembly circuit, having a buffer associated 
therewith, for receiving the data packet from the backbone, 
the packet disassembly circuit inserting the portion into an 
absolute location of the buffer, the position identifier deter- 
mining the location, the portion thereby synchronized with 
adjacent portions of the stream of digital audio data in the 
buffer to compensate for the variable periods of transmission 
time. 

Although the present invention and its advantages have 
been described in detail, those skilled in the art should 
understand that they can make various changes, substitu- 
tions and alterations herein without departing from the spirit 
and scope of the invention in its broadest form. 

What is claimed is: 

1. A system for communicating audio data in a packet- 
based computer network, transmission of data packets 
through said computer network requiring variable periods of 
transmission time, the system comprising:
a packet assembly circuit for constructing a data packet 
from a portion of a stream of digital audio data corre- 
sponding to an audio signal, said packet assembly 
circuit generating a position identifier indicating a 
temporal position of said portion relative to said 
stream, inserting said position identifier into said data
packet and queuing said data packet for transmission 
through a backbone of said computer network; and 
a packet disassembly circuit, having a buffer associated
therewith, for receiving said data packet from said 
backbone, said packet disassembly circuit inserting 
said portion into an absolute location of said buffer, said 
position identifier determining said location, said por- 
tion synchronized with adjacent portions of said stream 
of digital audio data in said buffer to compensate for 
said variable periods of transmission time. 

2. The system as recited in claim 1 further comprising an 
interpolation circuit for inserting synthesized audio data into 
a designated location of said buffer to lengthen said portions 
of said stream of audio data in said buffer. 

3. The system as recited in claim 1 further comprising a 
decimation circuit for deleting audio data from a designated 
location of said buffer to shorten said portions of said stream 
of audio data in said buffer. 

4. The system as recited in claim 1 wherein said data 
packet comprises source and destination fields for determin- 
ing a transmission route of said data packet through said 
computer network. 

5. The system as recited in claim 1 wherein a value of said 
position identifier is a function of a length of a portion of 
said stream of digital audio data in a previously-transmitted
data packet. 

6. The system as recited in claim 1 wherein a window of 
said buffer is about 20 milliseconds. 

7. The system as recited in claim 1 wherein said data 
packet is capable of containing a portion having a length of 
about 5.5 milliseconds. 

8. The system as recited in claim 1 further comprising a 
digital conversion/compression circuit, coupled to said 
packet assembly circuit, for digitizing and compressing said 
audio signal into said stream of digital audio data. 

9. The system as recited in claim 1 further comprising a 
decompression/analog conversion circuit, coupled to said 
packet disassembly circuit, for decompressing and convert- 
ing said stream of digital audio data back into said audio 
signal. 

10. The system as recited in claim 1 wherein said com- 
puter network comprises a plurality of computers coupled to 
said backbone, said packet assembly circuit and said packet 
disassembly circuit located in separate ones of said com- 
puters. 

11. A method of communicating audio data in a packet- 
based computer network, transmission of data packets 
through said computer network requiring variable periods of 
transmission time, the method comprising the steps of: 

constructing a data packet from a portion of a stream of 
digital audio data corresponding to an audio signal with 
a packet assembly circuit, said packet assembly circuit 
generating a position identifier indicating a temporal 
position of said portion relative to said stream, inserting 
said position identifier into said data packet and queu- 
ing said data packet for transmission through a back- 
bone of said computer network; and 

receiving said data packet from said backbone into a 
packet disassembly circuit having a buffer associated 
therewith, said packet disassembly circuit inserting said 
portion into an absolute location of said buffer, said 
position identifier determining said location, said por-
tion synchronized with adjacent portions of said stream 
of digital audio data in said buffer to compensate for 
said variable periods of transmission time. 

12. The method as recited in claim 11 further comprising 
the step of inserting synthesized audio data into a designated
location of said buffer to lengthen said portions of said 
stream of audio data in said buffer. 

13. The method as recited in claim 11 further comprising
the step of deleting audio data from a designated location of 
said buffer to shorten said portions of said stream of audio 
data in said buffer. 
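
The lengthening and shortening steps of claims 12 and 13 might be sketched as follows. The claims do not say how the synthesized audio data is produced, so linear interpolation between the two neighbouring samples is used here purely as an assumption, and the function names `lengthen` and `shorten` are hypothetical.

```python
def lengthen(samples: list, location: int, count: int) -> list:
    """Insert `count` synthesized samples before index `location`.

    Assumes 0 < location < len(samples) so both neighbours exist.
    """
    left, right = samples[location - 1], samples[location]
    # Linear interpolation (an assumed synthesis method, integer arithmetic).
    synthesized = [left + (right - left) * k // (count + 1)
                   for k in range(1, count + 1)]
    return samples[:location] + synthesized + samples[location:]


def shorten(samples: list, location: int, count: int) -> list:
    """Delete `count` samples starting at index `location`."""
    return samples[:location] + samples[location + count:]


stretched = lengthen([0, 30, 60], location=1, count=2)
restored = shorten(stretched, location=1, count=2)
```

Here `stretched` gains two interpolated samples between the first and second originals, and `restored` removes them again.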

14. The method as recited in claim 11 further comprising 
the step of determining a transmission route of said data 
packet through said computer network with source and 
destination fields in said data packet. 

15. The method as recited in claim 11 further comprising 
the step of assigning a value of said position identifier as a 
function of a length of a portion of said stream of digital 
audio data in a previously-transmitted data packet.
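
One way to read claim 15 is that each position identifier is the previous identifier advanced by the length of the previously transmitted portion. The sketch below works under that reading; the starting value of zero, the unit of length and the function name `position_identifiers` are assumptions, not taken from the claim.

```python
def position_identifiers(portion_lengths: list) -> list:
    """Return one identifier per packet as a running sum of earlier lengths."""
    positions = []
    next_position = 0  # assumed starting value
    for length in portion_lengths:
        positions.append(next_position)
        next_position += length  # advance by the length just transmitted
    return positions


ids = position_identifiers([44, 44, 40])
```

With portions of 44, 44 and 40 units, the identifiers are 0, 44 and 88.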

16. The method as recited in claim 11 further comprising 
the step of establishing a window of said buffer at about 20 
milliseconds. 

17. The method as recited in claim 11 further comprising 
the step of containing a portion having a length of about 5.5 
milliseconds in said data packet. 
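
To give the durations in claims 16 and 17 a concrete scale, the arithmetic below converts them to sample counts. The 8 kHz sampling rate is an assumption chosen for illustration (it is common in telephony) and is not stated in the claims; `samples_for` is a hypothetical helper.

```python
SAMPLE_RATE_HZ = 8000  # assumed sampling rate, not specified in the claims


def samples_for(duration_ms: float, rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of samples covering `duration_ms` at `rate_hz`."""
    return round(rate_hz * duration_ms / 1000)


window_samples = samples_for(20)     # buffer window of about 20 ms
portion_samples = samples_for(5.5)   # packet portion of about 5.5 ms
portions_per_window = window_samples / portion_samples
```

Under this assumption the window spans 160 samples and each portion 44 samples, so the window holds a little over three and a half portions.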

18. The method as recited in claim 11 further comprising 
the step of digitizing and compressing said audio signal into 
said stream of digital audio data with a digital conversion/ 
compression circuit coupled to said packet assembly circuit.

19. The method as recited in claim 11 further comprising 
the step of decompressing and converting said stream of 
digital audio data back into said audio signal with a decom- 
pression/analog conversion circuit coupled to said packet 
disassembly circuit.

20. The method as recited in claim 11 wherein said 
computer network comprises a plurality of computers 
coupled to said backbone, said method further comprising 
the step of locating said packet assembly circuit and said 
packet disassembly circuit in separate ones of said comput- 
ers. 

21. A packet-based computer network, comprising: 

a backbone coupling, and for communicating packeted 
data between, first and second computer nodes, serial 
transmission of data packets through said computer 
network requiring variable periods of transmission 
time; 

means, coupled to said first node, for receiving an original 
audio signal and generating therefrom a corresponding 
stream of digital audio data; 

a packet assembly circuit, associated with said first com- 
puter node, for constructing data packets from portions 
of said stream of digital audio data, each of said data 
packets including: 
one of said portions, and 

a position identifier indicating a temporal position of 
said one of said portions relative to said stream, said 
packet assembly circuit queuing said data packet for 
serial transmission to said second node through said 
backbone; 

a packet disassembly circuit, associated with said second 
computer node and a buffer, for serially receiving said 
data packets from said backbone, said packet disassem- 
bly circuit disassembling each of said data packets by: 
inserting said portion into an absolute location of said 
buffer, said position identifier determining said loca- 
tion, said portion synchronized with adjacent por- 
tions of said stream of digital audio data in said 
buffer to compensate for said variable periods of 
transmission time; and 
means, coupled to said second node, for generating a 
reconstructed audio signal from said stream of digital 
audio data in said buffer. 

22. The network as recited in claim 21 further comprising 
an interpolation circuit for inserting synthesized audio data 
into a designated location of said buffer to lengthen said
portions of said stream of audio data in said buffer. 

23. The network as recited in claim 21 further comprising 
a decimation circuit for deleting audio data from a desig-
nated location of said buffer to shorten said portions of said
stream of audio data in said buffer. 

24. The network as recited in claim 21 wherein said data 
packet comprises source and destination fields for designat- 
ing said first node as a source and said second node as a 
destination of said data packet.

25. The network as recited in claim 21 wherein a value of 
said position identifier is a function of a length of a portion 
of said stream of digital audio data in a previously-trans- 
mitted data packet. 

26. The network as recited in claim 21 wherein a window
of said buffer is about 20 milliseconds. 

27. The network as recited in claim 21 wherein said data 
packet is capable of containing a portion having a length of 
about 5.5 milliseconds. 

28. The network as recited in claim 21 wherein said
receiving means comprises a digital conversion/compres-
sion circuit, coupled to said packet assembly circuit, for
digitizing and compressing said audio signal into said stream
of digital audio data.

29. The network as recited in claim 21 wherein said
generating means comprises a decompression/analog con- 
version circuit, coupled to said packet disassembly circuit, 
for decompressing and converting said stream of digital 
audio data back into said audio signal. 

30. The network as recited in claim 21 wherein said
packet assembly circuit and said packet disassembly circuit 
are embodied in preprogrammed general purpose data pro- 
cessing and storage circuitry. 

31. A method, for communicating packeted data over a
backbone coupling first and second computer nodes of a
packet-based computer network, serial transmission of data
packets through said computer network requiring variable
periods of transmission time, said method comprising the
steps of:

receiving an original audio signal at said first computer
node and generating therefrom a corresponding stream 
of digital audio data; 

constructing data packets from portions of said stream of 
digital audio data with a packet assembly circuit asso-
ciated with said first computer node, each of said data
packets including: 
one of said portions, and 

a position identifier indicating a temporal position of 
said one of said portions relative to said stream, 
said data packet queued for serial transmission to said
second computer node through said backbone; 

serially receiving said data packets from said backbone 
with a packet disassembly circuit associated with said 
second computer node and a buffer, said packet disas- 
sembly circuit disassembling each of said data packets 
by: 

inserting said portion into an absolute location of said 
buffer, said position identifier determining said loca-
tion, said portion synchronized with adjacent por-
tions of said stream of digital audio data in said
buffer to compensate for said variable periods of 
transmission time; and 

generating a reconstructed audio signal from said 
stream of digital audio data in said buffer. 

32. The method as recited in claim 31 further comprising 
the step of inserting synthesized audio data into a designated 
location of said buffer to lengthen said portions of said 
stream of audio data in said buffer. 

33. The method as recited in claim 31 further comprising 
the step of deleting audio data from a designated location of 
said buffer to shorten said portions of said stream of audio 
data in said buffer. 

34. The method as recited in claim 31 further comprising 
the step of designating said first node as a source and said 
second node as a destination of said data packet with source 
and destination fields in said data packet. 

35. The method as recited in claim 31 further comprising 
the step of assigning a value of said position identifier as a 
function of a length of a portion of said stream of digital 
audio data in a previously-transmitted data packet.

36. The method as recited in claim 31 further comprising 
the step of establishing a window of said buffer at about 20 
milliseconds. 

37. The method as recited in claim 31 further comprising 
the step of containing a portion having a length of about 5.5 
milliseconds. 

38. The method as recited in claim 31 wherein said step 
of receiving comprises the step of digitizing and compress- 
ing said audio signal into said stream of digital audio data. 

39. The method as recited in claim 31 wherein said step 
of generating comprises the step of decompressing and 
converting said stream of digital audio data back into said 
audio signal. 

40. The method as recited in claim 31 further comprising 
the step of embodying said packet assembly circuit and said 
packet disassembly circuit in preprogrammed general pur- 
pose data processing and storage circuitry. 

***** 